SEQUENCE LISTING 



(1) GENERAL INFORMATION 

(i) APPLICANT: Hadlaczky, Gyula 
Szalay, Aladar 

(ii) TITLE OF INVENTION: ARTIFICIAL CHROMOSOMES,
METHODS PREPARING ARTIFICIAL CHROMOSOMES 

(iii) NUMBER OF SEQUENCES: 12 

(iv) CORRESPONDENCE ADDRESS:

(A) ADDRESSEE: Brown, Martin, Haller & McClain

(B) STREET: 1660 Union Street 

(C) CITY: San Diego 

(D) STATE: CA 

(E) COUNTRY: USA 

(F) ZIP: 92101-2926 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Diskette 

(B) COMPUTER: IBM Compatible 

(C) OPERATING SYSTEM: DOS

(D) SOFTWARE: FastSEQ Version 1.5 

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(C) CLASSIFICATION: 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: 08/629,822 

(B) FILING DATE: 10-APR-1996 

(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Seidman, Stephanie L

(B) REGISTRATION NUMBER: 33,779

(C) REFERENCE/DOCKET NUMBER: 6869-402A

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 619-238-0999 

(B) TELEFAX: 619-238-0062 

(C) TELEX: 



(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1293 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 



(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 
(ix) FEATURE: 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

GAATT CAT C A TTTTTCANGT CCTCAAGTCG ATaTTTCTC* TTTj™ TTNCAGTGA.T 

TCTCGCCATA TTCCTGGTCC TACAGTGTGC ^TTTCTCCAI ^ CAGTTTTCT N 

TTCGTCATTT TCAAGTCCTC AAGTGGATGT TTCTCAT1 1 qaCCTTTTTC AGTTTTCCTC 

GCCATATTCC ACGTCCTACA JNGGACATTT CTAAATI i G TGATTTTCA GTTTTCTCGC 

GCCATATTTC ACGTCCTAAA ATGTGTATTT ™™ QTTTTTCAGT GATTTCGTCA 

C AG AT T C C AG GTCCTATAAT ^TGCATTTCT ^ATII^ ATTTNCAGTT TTCTTGNAAT 

TTTTTTCAAG TCGGCAAGTG GATGTTTCTC ATllNLtm TTCATTTTTC CACGCCATAT 

Ittccatgtc ctacaatgat catttttaat tttccacctt ttcatT ttct cgccatattc 

TTCATGTCCT AAAGTGTATA ^TTCTCCTTT JCCGCGATTT CQ TCATTTTTC A 

CAGGTCCTAC AGTGTGCATT JCTCATTTTT CACCTT1 1 TTATCTTGTC ATATTCCATG 
AGTCGTCAAC TGGATCTTTC TAATTTTCCA TGATTTICA^ ATATTTGACG 
TCCTACAGTG GACATTTCTA AATTTTCCAA CTTTTTCAAT ^ ATTCCAGGTC 
TGCTAAAGTG TGTATTTCTT ATTTTCCGTG ^TTTTCAGTT ^ TTTTCCAGTT 
CTAATAGTGT GCATTTCTCA TTTTTCACGT TTTTCAGTGA AT TCCATGTC CT 

GT C AAGGGGA TGTTTCTCAT TTTCCATGAG TGTCAGTTTT ATATT -pCACGTCCTA 

ACAGTGACAT TTCTAAATAT JATACCTTTT TCAGT1 1 1 GCCATATTCC AGGTCCTACA 
AAGTATATAT TTCTCATTTT JCCTGATTTT CAGTTTCL1 1 ^ GCCCTCA AAT 

GTGTGCATTT CTCATTTTTC ACGTTTTTCA GTAAT1 1U TAT AC CATGT CCTACAGTGG 
GGATGTTTCT CATTTTCCAT GATTTTCAGT ^TTCTTGCCA ™T CQ TCCTAAAGT G 

a™SS SSSSS? t?Scgccat attccaggac ctacagtgtg 

CATTTCTCAT TTTTCACGTT TTTCAGTGAA TTC 
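
Listings in this layout conventionally print the bases in groups of ten, six groups to a line, with a running base count at the end of each line. The following is a minimal Python sketch of that numbering, assuming the 60-bases-per-line convention; the function name `line_counts` is hypothetical and is not part of the listing. For a 1293-base sequence such as SEQ ID NO: 1 it yields 21 full lines followed by a final short line ending at 1293.

```python
def line_counts(length, per_line=60):
    """Return the running base count expected at the end of each printed line.

    Assumes a fixed number of bases per line (60 by convention here);
    the last line may be shorter and ends at the total length.
    """
    counts = list(range(per_line, length + 1, per_line))
    if not counts or counts[-1] != length:
        counts.append(length)
    return counts


counts = line_counts(1293)
print(len(counts), counts[0], counts[-2], counts[-1])
```

The same function can be applied to the declared lengths of the other entries to see how many lines each sequence should occupy.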



(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1044 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 
(ix) FEATURE: 






(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:
AGGCCTATGG TGAAAAAGGA AAT AT C TT C C CCTGAAAACT AGACAGAAGG ATTCTCAGAA 

TCTTATTTGT GATGTGCGCC JCTCAACTAA CAGTGTTGAA GgTTTC tgaqgatttc 

TTTTGAAACA CTCTTTTTGT AAAATCTGCA AGAGGATA1 1 GAAGCTTCAT 

CGTTGGAAAC GGGATTGTCT T C AT AT AAAC ^TAGACAGA AGCA^ CAGGTTTGAA 
TGGGATGTTT CAGTTGAAGT CACAGTGTTG AACAGTCCCC JTTCAT 

ACACTCTTTT TTGTAGTATC TGGAAGTGGA CATTTGGAGC ^ TTTTCATGAT 
AAAGGAAATA TCTTCCAATA AAAGCTAGAT AGAGGCAAl^ x Q aaacaCTCTT 

GTATCTACTC AGCTAACAGA GTTGAACCTT JCTTTGAGAG JGCAGT GAAACGGGAT 

TTTGTGGAAT CTGCAAGTGG ATATTTGTCT AGCTTTGAGG TGCATTCAAG 

TACATATAAA AAGCAGACAG CAGCATTCCC AGAAACTTCT ™TGA TGATGTATCT 

SiSSES S??™ JS™^ ttcccctgaa 






»™»C« *-™C GCAGTTTTGA SSSS SSSS 

GAAGCTTTCT TTTGATAGAG GCAGTTTTGA AACAULU ATATAAAAAG CAGACAGCAG 
ATTTGTCTAG CTTTGAGGAT TTCTTTGGAA ^GGGATTAC AT ^ CATTCCCTTT 

SgIcS GTTTGAA.CAC T^TTATA T GTGGACATTT GGAGCGCTTT 1020 

CAGGGGGGAT CCTCTAGAAT TCCT 






(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2492 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 
(ix) FEATURE: 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:
CTGCAGCTGG GGGTCTCCAA TCAGGCAGGG GCCCCTTACT ACTCAGATGG GGTGGCCGAG 
TAGGGGAAGG GGGTGCAGGC TGCATGAGTG ^ACACAGCTG CAA GGTGGCCT GA 

GGATCTATGG GGGTGGGGAG AAGCCCAGTG ^CAGTGCC^ QCA GGAAGT GAGG 

GAGGGTCTGA GGAACATAGA GCTGGCCATG TTGOG^^ TAGCAAGGAG GGCTTGGGGT 
AATGGGACAG GCTTGAGGAT ACTCTACTCA GTAGCCAGGA ™^ CCCAGGACAT 
TGCTATCCTG GGGTTCAACC CCCCAGGTTG AAGGCCCI^ cc ATGCCA AGAG 

ATTACAATGG ACACAGGAGG TTGGGACACC JGGAGTCACC ^ T C AAAT C CAA 

AGACCATGAG TAGGGGTGTC CAGTCCAGCC CTCibALibH qaCCCTGGGC CACACGCGTT 
AGGGCCCCTG CTGCCACCTA JTGGCTGATG ^ATCCACAT gACCCTG^ TGAG C AGAG A 
TAGGGTCTCT GTGAAGACCA AGATCCTTGT TALAl AAAAAGCCTG GGGGATGGCA 
TTTCCACCTA TTCGAAACAA T C AC AT AAAA TCCATCCTGG TAGGGTTAGG 
CTAAGGCTAG GGATAGGGTG GGAT GAAGAT TATAGTTACA 6TAAGG6 GGGGTTAGGG 
GATCAACGTT GGTTAGGAGT TAGGGATACA GTAGGGTACC GTTAGGGTTA 
TTAGGGGTTA GGGTTAGGGT TAGGGTTAGG ^TAGGCTTA ^ GG C AATG AAA 

GGGTTAGGTT TTGGGGTGGC GTATTTTGGT CTTATACGCT ^ ^^CA 

AGAGTTCTTG TTTTTCCTTC AGCAATTTGT CATT1 1 iAAA TGTGGTTTCA 
G AT AT AGAC C AGCTGTGCTA TCTCATTGTG GTTTTCAATT £ GATGTGTGTG 
ATGTGTTTAC TTGCCATCTG TAG AT CTT C T TTGCblbi^b TAATAATTTT TTATATATTT 
CATTTCTTGN NTTTNGGCTG TTTAACTTAT JGTTTAGTTT TAATAAT^ gqcttgcttt 
GAAGACAAAT CTTTCTCAGA TGTGTATTTG CAAATA1 l it CCACA CTGTCAC TTC 

TGTCTCTAAC AAGGTCTCTT CAGAGATAAC TTAAATA1AA c CCAAAGGC AG 

TTTTGTGTAT ATCTACCTTT TGTGTCATTT GTTAAAATTC £ TTAGTGTAAG 
ATAGCTTTTC TTCTATTGTT TCTTCTAGAA ATTTGTATAG TT CAT AT CATTT 

GATGATTTTG AGTGATTATT TGTGTAAGTT GTAAAGTT ACACAG GATAGT GGGC 

CTTATGGTTT CCAATTAATC GTTCCCTCAC ^^t,t,tz rnCCTCCTGG AAAAGGGAAA 

Stgttagag tagataggta gctagacatg ^gaggg ggcctcctgg atctctagtg 

GTCTGGGAAG GCTCACCTGG AGGACCACCA JA^TIOAUA c TATGCAGAAA 

CTGGAGTGGA TGGGCACTTG TCAATTGTGG GTAGGAGGGA ^ AAAAATGTCA 
GAAACTCCCT AGAACTCCTC TGAAGATGCC CCAATCATTC ^CTCl^ aaacCTGGCA 
GAATATTGCT AGCTACATGC TGATAAGGNN AAAGGGGACA JTCTI^^ QTAGGTACAA 
ACGTGATCGC T^fcS CC^S SSSSSS GCTGGTGCCA GAGTGGATT C 






(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 28 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO



(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 
(ix) FEATURE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

GGGGAATTCA TTGGGATGTT TCAGTTGA



(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 29 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 
(ix) FEATURE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:



CGAAAGTCCC CCCTAGGAGA TCTTAAGGA 

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 






(ii) MOLECULE TYPE: RNA 



(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 
(ix) FEATURE: 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
CCGCTTAATA CTCTGATGAG TCCGTGAGGA CGAAACGCTC TCGCACC 



(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 
(ix) FEATURE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

CGATTTAAAT TAATTAAGCC CGGGC 



(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 
(ix) FEATURE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

TAAATTTAAT TAATTCGGGC CCGTCGA

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 69 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: Genomic DNA 

(D) OTHER INFORMATION: IL-2 signal sequence

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

ATG TAC AGG ATG CAA CTC CTG TCT TGC ATT GCA CTA ACT CTT GCA CTT
Met Tyr Arg Met Gln Leu Leu Ser Cys Ile Ala Leu Ser Leu Ala Leu

GTC ACA AAC AGT GCA CCT ACT 
Val Thr Asn Ser Ala Pro Thr 

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 945 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(vi) ORIGINAL SOURCE: 
(ix) FEATURE: 

(A) NAME/KEY: Coding Sequence 

(B) LOCATION: 1...942

(D) OTHER INFORMATION: Renilla Reniformis Luciferase
(x) PUBLICATION INFORMATION: 
PATENT NO.: 5,418,155 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

ACC TTA AAG ATG ACT TCG AAA GTT TAT GAT CCA GAA CAA AGG AAA CGG 
Ser Leu Lys Met Thr Ser Lys Val Tyr Asp Pro Glu Gln Arg Lys Arg
1 5 10

ATG ATA ACT GGT CCG CAG TGG TGG GCC AGA TGT AAA CAA ATG AAT GTT
Met Ile Thr Gly Pro Gln Trp Trp Ala Arg Cys Lys Gln Met Asn Val
20 25 30 

CTT GAT TCA TTT ATT AAT TAT TAT GAT TCA GAA AAA CAT GCA GAA AAT 
Leu Asp Ser Phe Ile Asn Tyr Tyr Asp Ser Glu Lys His Ala Glu Asn
35 40



CCT GTT ATT TTT TTA CAT GGT AAC GCG GCC TCT TCT TAT TTA TGG CGA 
aS Val Ile lie Leu His Gly Asn Ala Ala Ser Ser Tyr Leu Trp Arg
50 55 60 

TQ CCA CAT ATT GAG CCA GTA GCG CGG TGT ATT ATA CCA GAT 
His Val Val Pro His Ile Glu Pro Val Ala Arg Cys Ile Ile Pro Asp
65 70 75 

CTT ATT GGT ATG GGC AAA TCA GGC AAA TCT GGT AAT GGT TCT TAT AGG 
Leu Ile Gly Met Gly Lys Ser Gly Lys Ser Gly Asn Gly Ser Tyr Arg

85 90 






TTA CTT GAT CAT TAC AAA TAT CTT ACT GCA TGG TTG AAC TTC TTA ATT 
Leu Leu Asp His Tyr Lys Tyr Leu Thr Ala Trp Leu Asn Phe Leu Ile
100 105 110

TAC CAA AGA AGA TCA TTT TTT GTC GGC CAT GAT TGG GGT GCT TGT TTG
Tyr Gln Arg Arg Ser Phe Phe Val Gly His Asp Trp Gly Ala Cys Leu
115 120 125 

GCA TTT CAT TAT AGC TAT GAG CAT CAA GAT AAG ATC AAA GCA ATA GTT 
Ala Phe His Tyr Ser Tyr Glu His Gln Asp Lys Ile Lys Ala Ile Val
130 135 140

CAC GCT GAA AGT GTA GTA GAT GTG ATT GAA TCA TGG GAT GAA TGG CCT
His Ala Glu Ser Val Val Asp Val Ile Glu Ser Trp Asp Glu Trp Pro
145 150 155 160

GAT ATT GAA GAA GAT ATT GCG TTG ATC AAA TCT GAA GAA GGA GAA AAA 
Asp Ile Glu Glu Asp Ile Ala Leu Ile Lys Ser Glu Glu Gly Glu Lys

165 170 175 

ATG GTT TTG GAG AAT AAC TTC TTC GTG GAA ACC ATG TTG CCA TCA AAA 
Met Val Leu Glu Asn Asn Phe Phe Val Glu Thr Met Leu Pro Ser Lys 
180 185 190

ATC ATG AGA AAG TTA GAA CCA GAA GAA TTT GCA GCA TAT CTT GAA CCA
Ile Met Arg Lys Leu Glu Pro Glu Glu Phe Ala Ala Tyr Leu Glu Pro
195 200 205 

TTC AAA GAG AAA GGT GAA GTT CGT CGT CCA ACA TTA TCA TGG CCT CGT 
Phe Lys Glu Lys Gly Glu Val Arg Arg Pro Thr Leu Ser Trp Pro Arg 
210 215 220

GAA ATC CCG TTA GTA AAA GGT GGT AAA CCT GAC GTT GTA CAA ATT GTT
Glu Ile Pro Leu Val Lys Gly Gly Lys Pro Asp Val Val Gln Ile Val
225 230 235 

AGG AAT TAT AAT GCT TAT CTA CGT GCA AGT GAT GAT TTA CCA AAA ATG 
Arg Asn Tyr Asn Ala Tyr Leu Arg Ala Ser Asp Asp Leu Pro Lys Met 

245 250 

TTT ATT GAA TCG GAT CCA GGA TTC TTT TCC AAT GCT ATT GTT GAA GGC 
Phe Ile Glu Ser Asp Pro Gly Phe Phe Ser Asn Ala Ile Val Glu Gly
260 265 270

GCC AAG AAG TTT CCT AAT ACT GAA TTT GTC AAA GTA AAA GGT CTT CAT
Ala Lys Lys Phe Pro Asn Thr Glu Phe Val Lys Val Lys Gly Leu His
275 280 285 

TTT TCG CAA GAA GAT GCA CCT GAT GAA ATG GGA AAA TAT ATC AAA TCG 
Phe Ser Gln Glu Asp Ala Pro Asp Glu Met Gly Lys Tyr Ile Lys Ser
290 295 300

TTC GTT GAG CGA GTT CTC AAA AAT GAA CAA TAA
Phe Val Glu Arg Val Leu Lys Asn Glu Gln
305 310 
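
The feature entry for SEQ ID NO: 10 gives the coding sequence as positions 1 to 942 of a 945-base sequence. The short Python sketch below works through that arithmetic using only values stated in this entry; the variable names are illustrative, and the 16-codons-per-row figure is taken from the printed layout rather than from any stated rule. Under those assumptions, 942 bases correspond to 314 codons, the remaining three bases are the terminal TAA, and the sequence occupies 20 printed rows.

```python
total_length = 945            # (A) LENGTH, in bases
cds_start, cds_end = 1, 942   # (B) LOCATION of the coding sequence
codons_per_row = 16           # assumption: read off the printed rows

cds_length = cds_end - cds_start + 1
codons, remainder = divmod(cds_length, 3)    # whole codons in the coding region
trailing_bases = total_length - cds_end      # bases after the coding region

bases_per_row = 3 * codons_per_row
rows = -(-total_length // bases_per_row)     # ceiling division

print(codons, remainder, trailing_bases, rows)
```

A nonzero `remainder` would indicate that the stated location does not cover a whole number of codons.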






(2) INFORMATION FOR SEQ ID NO: 11: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 
(ix) FEATURE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:
TTTGAATTCA TGTACAGGAT GCAACTCCTG

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 
(ix) FEATURE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:



TTTGAATTCA GTAGGTGCAC TGTTTGTCAC 
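
Each entry declares a length under (A) LENGTH, so the short sequences can be checked against their own headers. The sketch below is a minimal Python consistency check, assuming the sequences are transcribed exactly as printed above; the names `entries` and `base_count` are hypothetical. Whitespace is ignored because the ten-base grouping is only a layout device.

```python
# SEQ ID NO -> (declared length, sequence as printed in the listing)
entries = {
    4: (28, "GGGGAATTCA TTGGGATGTT TCAGTTGA"),
    5: (29, "CGAAAGTCCC CCCTAGGAGA TCTTAAGGA"),
    6: (47, "CCGCTTAATA CTCTGATGAG TCCGTGAGGA CGAAACGCTC TCGCACC"),
    7: (25, "CGATTTAAAT TAATTAAGCC CGGGC"),
    8: (27, "TAAATTTAAT TAATTCGGGC CCGTCGA"),
    11: (30, "TTTGAATTCA TGTACAGGAT GCAACTCCTG"),
    12: (30, "TTTGAATTCA GTAGGTGCAC TGTTTGTCAC"),
}


def base_count(printed):
    """Count bases in a printed sequence, ignoring the grouping spaces."""
    return len("".join(printed.split()))


# Collect any entry whose counted length differs from its declared length.
mismatches = {
    seq_id: (declared, base_count(printed))
    for seq_id, (declared, printed) in entries.items()
    if declared != base_count(printed)
}
print(mismatches)
```

An empty result means every listed entry agrees with its declared length; any entry that appears in `mismatches` is worth re-reading against the original filing.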



